16 research outputs found

    Efficient posterior sampling for high-dimensional imbalanced logistic regression

    High-dimensional data are routinely collected in many areas. We are particularly interested in Bayesian classification models in which one or more variables are imbalanced. Current Markov chain Monte Carlo algorithms for posterior computation are inefficient as n and/or p increase due to worsening time per step and mixing rates. One strategy is to use a gradient-based sampler to improve mixing while using data sub-samples to reduce per-step computational complexity. However, usual sub-sampling breaks down when applied to imbalanced data. Instead, we generalize piece-wise deterministic Markov chain Monte Carlo algorithms to include importance-weighted and mini-batch sub-sampling. These approaches maintain the correct stationary distribution with arbitrarily small sub-samples, and substantially outperform current competitors. We provide theoretical support and illustrate gains in simulated and real data applications. Comment: 4 figures
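The importance-weighted sub-sampling idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' sampler: it only shows how a mini-batch drawn with non-uniform probabilities can still give an unbiased estimate of a full-data gradient, here for a one-parameter logistic model. The function names, the `rare_boost` factor, and the choice of sampling probabilities are all assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_terms(beta, xs, ys):
    """Per-observation gradients of the negative log-likelihood (1-D logistic model)."""
    return [(sigmoid(beta * x) - y) * x for x, y in zip(xs, ys)]

def importance_weights(xs, ys, rare_boost=10.0):
    """Hypothetical sampling probabilities that over-sample the rare class (y == 1)."""
    raw = [rare_boost if y == 1 else 1.0 for y in ys]
    total = sum(raw)
    return [r / total for r in raw]

def subsampled_gradient(beta, xs, ys, probs, batch_size, rng):
    """Unbiased estimate of the full gradient from an importance-weighted mini-batch.

    Each sampled term is divided by its sampling probability, so the expectation
    of the estimate equals the sum over all observations.
    """
    idx = rng.choices(range(len(xs)), weights=probs, k=batch_size)
    est = 0.0
    for i in idx:
        g_i = (sigmoid(beta * xs[i]) - ys[i]) * xs[i]
        est += g_i / (probs[i] * batch_size)
    return est
```

Averaging many calls to `subsampled_gradient` should approach `sum(grad_terms(beta, xs, ys))`; how such an estimate enters a piecewise deterministic sampler is described in the paper itself and is not reproduced here.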

    Posterior computation with the Gibbs zig-zag sampler

    An intriguing new class of piecewise deterministic Markov processes (PDMPs) has recently been proposed as an alternative to Markov chain Monte Carlo (MCMC). In order to facilitate the application to a larger class of problems, we propose a new class of PDMPs termed Gibbs zig-zag samplers, which allow parameters to be updated in blocks with a zig-zag sampler applied to certain parameters and traditional MCMC-style updates to others. We demonstrate the flexibility of this framework on posterior sampling for logistic models with shrinkage priors for high-dimensional regression and random effects and provide conditions for geometric ergodicity and the validity of a central limit theorem. Comment: 29 pages, 4 figures
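For readers unfamiliar with zig-zag dynamics, the following is a minimal sketch of a plain one-dimensional zig-zag process targeting a standard normal distribution. It is not the Gibbs zig-zag sampler proposed in the paper (the block updates with conventional MCMC steps are not shown); it only illustrates the building block. The function name and the choice of target are assumptions. For this target the switching rate along a trajectory is max(0, a + s) with a = theta * x, so event times can be drawn in closed form.

```python
import math
import random

def zigzag_standard_normal(n_events, rng, x0=0.0):
    """Run a 1-D zig-zag process targeting N(0, 1).

    Returns time-averaged estimates of E[x] and E[x^2] along the
    piecewise-linear trajectory.
    """
    x, theta = x0, 1.0
    total_time = 0.0
    int_x = 0.0   # integral of x(t) dt
    int_x2 = 0.0  # integral of x(t)^2 dt
    for _ in range(n_events):
        a = theta * x
        e = rng.expovariate(1.0)
        # Solve: integral_0^tau max(0, a + s) ds = e
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)
        x_new = x + theta * tau
        # Exact integrals over the linear segment from x to x_new
        int_x += x * tau + 0.5 * theta * tau * tau
        int_x2 += (x_new ** 3 - x ** 3) / (3.0 * theta)
        total_time += tau
        x, theta = x_new, -theta  # flip the velocity at each event
    return int_x / total_time, int_x2 / total_time
```

With a long enough run, the two returned averages should be close to 0 and 1 respectively, the mean and second moment of the target.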

    Clustering Multiple Sclerosis Medication Sequence Data with Mixture Markov Chain Analysis with covariates using Multiple Simplex Constrained Optimization Routine (MSiCOR)

    Multiple sclerosis (MS) is an autoimmune disease of the central nervous system that causes neurodegeneration. While disease-modifying therapies (DMTs) reduce inflammatory disease activity and delay worsening disability in MS, treatment responses vary significantly across people with MS (pwMS). pwMS often receive serial monotherapies of DMTs. Here, we propose a novel method to cluster pwMS according to the sequence of DMT prescriptions and associated clinical features (covariates). This is achieved via a mixture Markov chain analysis with covariates, where the sequence of prescribed DMTs for each patient is modeled as a Markov chain. Given the computational challenges of maximizing the mixture likelihood on the constrained parameter space, we develop a pattern search-based global optimization technique that can optimize any objective function on a collection of simplexes and is shown to outperform other related global optimization techniques. In simulation experiments, the proposed method is shown to outperform the Expectation-Maximization (EM) algorithm-based method for clustering sequence data without covariates. Based on the analysis, we divided MS patients into 3 clusters: interferon-beta dominated, multi-DMTs, and natalizumab dominated. Further cluster-specific summaries of relevant covariates indicate patient differences among the clusters. This method may guide the DMT prescription sequence based on clinical features.
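The mixture Markov chain idea in this abstract can be sketched in a few lines. The example below is a simplified illustration without covariates and without the MSiCOR optimizer: given fixed mixture weights, initial distributions, and transition matrices (all hypothetical inputs here), it computes the posterior probability that one sequence belongs to each cluster. States are encoded as integer indices, and all probabilities are assumed strictly positive.

```python
import math

def sequence_log_prob(seq, init, trans):
    """Log-probability of one state sequence under a first-order Markov chain."""
    lp = math.log(init[seq[0]])
    for prev, cur in zip(seq, seq[1:]):
        lp += math.log(trans[prev][cur])
    return lp

def cluster_responsibilities(seq, weights, inits, transs):
    """Posterior probability of each mixture component given one sequence."""
    logs = [
        math.log(w) + sequence_log_prob(seq, init, trans)
        for w, init, trans in zip(weights, inits, transs)
    ]
    m = max(logs)  # subtract the max for numerical stability
    unnorm = [math.exp(l - m) for l in logs]
    z = sum(unnorm)
    return [u / z for u in unnorm]
```

A sequence that stays in one state is assigned mostly to the component whose transition matrix favors remaining in that state; assigning each patient to the component with the largest responsibility gives a hard clustering.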

    SEQUENTIAL MONTE CARLO FOR BAYESIAN INFERENCE AND DATA ASSIMILATION

    Ph.D. (Doctor of Philosophy)